Results 1 - 12 of 12
1.
Cancer Res Commun ; 4(4): 1041-1049, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38592452

ABSTRACT

Cancer research depends on accurate and relevant information about a patient's medical journey. Data in radiology reports are of extreme value but lack the consistent structure needed for direct use in analytics. At Memorial Sloan Kettering Cancer Center (MSKCC), radiology reports are curated using the gold-standard approach of human annotation. However, manually curating large volumes of retrospective data slows the pace of cancer research. The manual curation process is sensitive to the volume of reports, the number of data elements, and the nature of the reports, and it demands an appropriate skillset. In this work, we explore state-of-the-art methods in artificial intelligence (AI) and implement an end-to-end pipeline for fast and accurate annotation of radiology reports. Language models (LMs) are trained on curated data by approaching curation as a multiclass or multilabel classification problem. The classification tasks are to predict multiple imaging scan sites, the presence of cancer, and cancer status from the reports. The trained natural language processing (NLP) classifiers achieve high weighted F1 scores and accuracy. We propose and demonstrate the use of these models to assist the manual curation process, which results in higher accuracy and F1 score with less time and cost, thus improving cancer research efforts. SIGNIFICANCE: Extracting structured data from radiology reports for cancer research by a manual process is laborious. Extracting data elements with the assistance of NLP models is faster and more accurate.
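The abstract reports a weighted F1 score for multiclass classifiers without defining it. As a minimal sketch, the usual definition computes a per-class F1 and averages the classes weighted by their support (the number of true instances of each class). The class labels below ("stable", "progressing", "improving") are hypothetical illustrations, not the label set used in the study.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores.

    y_true, y_pred: equal-length sequences of class labels.
    """
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Classes that never occur in y_true have zero support and add nothing.
        score += f1 * support[label] / total
    return score


# Hypothetical cancer-status labels, for illustration only.
truth = ["stable", "stable", "progressing", "improving"]
preds = ["stable", "progressing", "progressing", "improving"]
print(weighted_f1(truth, preds))
```

Weighting by support keeps rare classes from dominating the average, which matters when report categories are imbalanced; a macro average would instead treat every class equally.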


Subjects
Labor, Obstetric , Neoplasms , Radiology , Humans , Pregnancy , Female , Artificial Intelligence , Retrospective Studies , Natural Language Processing , Neoplasms/diagnostic imaging
2.
Genome Biol Evol ; 6(1): 76-93, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24407854

ABSTRACT

Symbiotic associations between animals and microbes are ubiquitous in nature, with an estimated 15% of all insect species harboring intracellular bacterial symbionts. Most bacterial symbionts share many genomic features including small genomes, nucleotide composition bias, high coding density, and a paucity of mobile DNA, consistent with long-term host association. In this study, we focus on the early stages of genome degeneration in a recently derived insect-bacterial mutualistic intracellular association. We present the complete genome sequence and annotation of Sitophilus oryzae primary endosymbiont (SOPE). We also present the finished genome sequence and annotation of strain HS, a close free-living relative of SOPE and other insect symbionts of the Sodalis-allied clade, whose gene inventory is expected to closely resemble the putative ancestor of this group. Structural, functional, and evolutionary analyses indicate that SOPE has undergone extensive adaptation toward an insect-associated lifestyle in a very short time period. The genome of SOPE is large in size when compared with many ancient bacterial symbionts; however, almost half of the protein-coding genes in SOPE are pseudogenes. There is also evidence for relaxed selection on the remaining intact protein-coding genes. Comparative analyses of the whole-genome sequence of strain HS and SOPE highlight numerous genomic rearrangements, duplications, and deletions facilitated by a recent expansion of insertion sequence elements, some of which appear to have catalyzed adaptive changes. Functional metabolic predictions suggest that SOPE has lost the ability to synthesize several essential amino acids and vitamins. Analyses of the bacterial cell envelope and genes encoding secretion systems suggest that these structures and elements have become simplified in the transition to a mutualistic association.


Subjects
Adaptation, Physiological , Enterobacteriaceae/genetics , Evolution, Molecular , Genome, Bacterial , Symbiosis/genetics , Animals , Base Sequence , Coleoptera/microbiology , Molecular Sequence Data
3.
PLoS Genet ; 8(11): e1002990, 2012.
Article in English | MEDLINE | ID: mdl-23166503

ABSTRACT

Despite extensive study, little is known about the origins of the mutualistic bacterial endosymbionts that inhabit approximately 10% of the world's insects. In this study, we characterized a novel opportunistic human pathogen, designated "strain HS," and found that it is a close relative of the insect endosymbiont Sodalis glossinidius. Our results indicate that ancestral relatives of strain HS have served as progenitors for the independent descent of Sodalis-allied endosymbionts found in several insect hosts. Comparative analyses indicate that the gene inventories of the insect endosymbionts were independently derived from a common ancestral template through a combination of irreversible degenerative changes. Our results provide compelling support for the notion that mutualists evolve from pathogenic progenitors. They also elucidate the role of degenerative evolutionary processes in shaping the gene inventories of symbiotic bacteria at a very early stage in these mutualistic associations.


Subjects
Bacteria , Biological Evolution , Host-Parasite Interactions/genetics , Insects/genetics , Symbiosis , Animals , Bacteria/genetics , Bacteria/pathogenicity , Enterobacteriaceae/genetics , Evolution, Molecular , Humans , Molecular Sequence Data , Tsetse Flies/genetics , Tsetse Flies/microbiology
4.
Hum Mutat ; 32(3): 299-308, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21972111

ABSTRACT

Nonsense mutations are usually predicted to function as null alleles due to premature termination of protein translation. However, nonsense mutations in the DMD gene, encoding the dystrophin protein, have been associated with both the severe Duchenne Muscular Dystrophy (DMD) and milder Becker Muscular Dystrophy (BMD) phenotypes. In a large survey, we identified 243 unique nonsense mutations in the DMD gene, and for 210 of these we could establish definitive phenotypes. We analyzed the reading frame predicted by exons flanking those in which nonsense mutations were found, and present evidence that nonsense mutations resulting in BMD likely do so by inducing exon skipping, confirming that exonic point mutations affecting exon definition have played a significant role in determining phenotype. We present a new model based on the combination of exon definition and intronic splicing regulatory elements for the selective association of BMD nonsense mutations with a subset of DMD exons prone to mutation-induced exon skipping.


Subjects
Codon, Nonsense , Dystrophin/genetics , Exons , Muscular Dystrophy, Duchenne/genetics , RNA Splicing , Female , Humans , Male , Muscular Dystrophy, Duchenne/metabolism , Phenotype , RNA Splicing/genetics
5.
Neuromuscul Disord ; 20(8): 499-504, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20630757

ABSTRACT

Manifesting carriers of DMD gene mutations may present diagnostic challenges, particularly in the absence of a family history of dystrophinopathy. We review the clinical and genetic features in 15 manifesting carriers identified among 860 subjects within the United Dystrophinopathy Project, a large clinical dystrophinopathy cohort whose members undergo comprehensive DMD mutation analysis. We defined manifesting carriers as females with significant weakness, excluding those with only myalgias/cramps. DNA extracted from peripheral blood was used to study X-chromosome inactivation patterns. Among these manifesting carriers, age at symptom onset ranged from 2 to 47 years. Seven had no family history and eight had male relatives with Duchenne muscular dystrophy (DMD). Clinical severity among the manifesting carriers varied from a DMD-like progression to a very mild Becker muscular dystrophy-like phenotype. Eight had exonic deletions or duplications and six had point mutations. One patient had two mutations (an exonic deletion and a splice site mutation), consistent with a heterozygous compound state. The X-chromosome inactivation pattern was skewed toward non-random in four out of seven informative deletions or duplications but was random in all cases with nonsense mutations. We present the results of DMD mutation analysis in this manifesting carrier cohort, including the first example of a presumably compound heterozygous DMD mutation. Our results demonstrate that improved molecular diagnostic methods facilitate the identification of DMD mutations in manifesting carriers, and confirm the heterogeneity of mutational mechanisms as well as the wide spectrum of phenotypes.


Subjects
Dystrophin/genetics , Muscular Dystrophy, Duchenne/genetics , Muscular Dystrophy, Duchenne/pathology , Adolescent , Adult , Cardiomyopathy, Dilated/genetics , Cardiomyopathy, Dilated/pathology , Child , Child, Preschool , DNA Mutational Analysis , Female , Heart Function Tests , Heterozygote , Humans , Male , Middle Aged , Muscle Weakness/genetics , Muscle Weakness/physiopathology , Muscle, Skeletal/pathology , Mutation/genetics , Mutation/physiology , X Chromosome Inactivation/genetics , Young Adult
6.
Hum Mutat ; 30(12): 1657-66, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19937601

ABSTRACT

Mutations in the DMD gene, encoding the dystrophin protein, are responsible for the dystrophinopathies Duchenne Muscular Dystrophy (DMD), Becker Muscular Dystrophy (BMD), and X-linked Dilated Cardiomyopathy (XLDC). Mutation analysis has traditionally been challenging, due to the large gene size (79 exons over 2.2 Mb of genomic DNA). We report a very large aggregate data set composed of DMD mutations detected in samples from patients enrolled in the United Dystrophinopathy Project, a multicenter research consortium, and in referral samples submitted for mutation analysis with a diagnosis of dystrophinopathy. We report 1,111 mutations in the DMD gene, including 891 mutations with associated phenotypes. These results encompass 506 point mutations (including 294 nonsense mutations) and significantly expand the number of mutations associated with the dystrophinopathies, highlighting the utility of modern diagnostic techniques. Our data support the uniform hypermutability of CGA>TGA mutations, establish the frequency of polymorphic muscle (Dp427m) protein isoforms, and reveal unique genomic haplotypes associated with "private" mutations. We note that 60% of these patients would be predicted to benefit from skipping of a single DMD exon using antisense oligonucleotide therapy, and 62% would be predicted to benefit from an inclusive multiexon-skipping approach directed toward exons 45 through 55.


Subjects
Diagnostic Techniques and Procedures , Dystrophin/genetics , Muscular Dystrophy, Duchenne/diagnosis , Muscular Dystrophy, Duchenne/genetics , Mutation/genetics , Amino Acid Sequence , Amino Acid Substitution/genetics , Cohort Studies , Dystrophin/chemistry , Exons/genetics , Haplotypes/genetics , Humans , Molecular Sequence Data , Phenotype , Polymorphism, Single Nucleotide/genetics
7.
Neuromuscul Disord ; 19(11): 743-8, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19793655

ABSTRACT

A recurrent exon 1 nonsense mutation in the DMD gene, p.Trp3X (c.9G>A), was first ascertained in a proband with no symptoms until age 20 and who walked until the age of 62. Six other unrelated kindreds carrying a p.Trp3X mutation were subsequently ascertained, five from North America and one from Italy. In six of the seven kindreds, the proband presented in childhood incidental to elevated creatine kinase levels detected in the context of other illnesses, or in the setting of cramps with or without rhabdomyolysis. Genetic analysis by high-density SNP genotyping demonstrates that the six North American families share a 3.7 Mbp haplotype surrounding the p.Trp3X allele, signifying that this is a founder mutation in these individuals. The size of the founder haplotype and the structure of shared genome-wide segments suggest that the minimal age of this mutation is >6 generations. The discovery of the first DMD founder mutation, associated with a mild Becker phenotype, suggests that the prevalence of hypomorphic dystrophin mutations should be re-examined with the use of improved genomic analysis.


Subjects
Codon, Nonsense/genetics , Dystrophin/genetics , Family Health , Founder Effect , Muscular Dystrophy, Duchenne/genetics , Tryptophan/genetics , Adolescent , Child , Child, Preschool , Chromosomes, Human, X , Exons/genetics , Female , Genome-Wide Association Study/methods , Humans , Italy , Male , Middle Aged , North America , Young Adult
8.
Nicotine Tob Res ; 11(7): 785-96, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19436041

ABSTRACT

INTRODUCTION: Previous research revealed significant associations between haplotypes in the CHRNA5-A3-B4 subunit cluster and scores on the Fagerström Test for Nicotine Dependence among individuals reporting daily smoking by age 17. The present study used subsamples of participants from that study to investigate associations between the CHRNA5-A3-B4 haplotypes and an array of phenotypes not analyzed previously (i.e., withdrawal severity, ability to stop smoking, and specific scales on the Wisconsin Inventory of Smoking Dependence Motives [WISDM-68] that reflect loss of control, strong craving, and heavy smoking). METHODS: Two cohorts of current or former smokers (N = 886) provided both self-report data and DNA samples. One sample (Wisconsin) comprised smokers making a quit-smoking attempt, which permitted the assessment of withdrawal and relapse during the attempt. The other sample (Utah) comprised participants studied for risk factors for nicotine dependence and chronic obstructive pulmonary disease and included individuals originally recruited in the Lung Health Study. RESULTS: The CHRNA5-A3-B4 haplotypes were significantly associated with the targeted WISDM-68 scales (Tolerance, Craving, Loss of Control) in both samples of participants but only among individuals who began smoking early in life. The haplotypes were significantly associated with relapse likelihood and withdrawal severity, but these associations showed no evidence of an interaction with age at daily smoking. DISCUSSION: The CHRNA5-A3-B4 haplotypes are associated with a broad range of nicotine dependence phenotypes, but these associations are not consistently moderated by age at initial smoking.


Subjects
Behavior, Addictive/genetics , Nerve Tissue Proteins/genetics , Receptors, Nicotinic/genetics , Smoking/genetics , Tobacco Use Disorder/genetics , Adult , Aged , Female , Haplotypes , Humans , Male , Middle Aged , Phenotype , Polymorphism, Single Nucleotide , Risk Factors , Utah , Wisconsin
9.
PLoS Genet ; 4(7): e1000125, 2008 Jul 11.
Article in English | MEDLINE | ID: mdl-18618000

ABSTRACT

People who begin daily smoking at an early age are at greater risk of long-term nicotine addiction. We tested the hypothesis that associations between nicotinic acetylcholine receptor (nAChR) genetic variants and nicotine dependence assessed in adulthood will be stronger among smokers who began daily nicotine exposure during adolescence. We compared nicotine addiction, measured by the Fagerström Test of Nicotine Dependence, in three cohorts of long-term smokers recruited in Utah, Wisconsin, and by the NHLBI Lung Health Study, using a candidate-gene approach with the neuronal nAChR subunit genes. This SNP panel included common coding variants and haplotypes detected in eight alpha and three beta nAChR subunit genes found in European American populations. In the 2,827 long-term smokers examined, common susceptibility and protective haplotypes at the CHRNA5-A3-B4 locus were associated with nicotine dependence severity (p = 2.0 × 10⁻⁵; odds ratio = 1.82; 95% confidence interval 1.39-2.39) in subjects who began daily smoking at or before the age of 16, an exposure period that results in a more severe form of adult nicotine dependence. A substantial shift in susceptibility versus protective diplotype frequency (AA versus BC = 17%, AA versus CC = 27%) was observed in the group that began smoking by age 16. This genetic effect was not observed in subjects who began daily nicotine use after the age of 16. These results establish a strong mechanistic link among early nicotine exposure, common CHRNA5-A3-B4 haplotypes, and adult nicotine addiction in three independent populations of European origins. The identification of an age-dependent susceptibility haplotype reinforces the importance of preventing early exposure to tobacco through public health policies.
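The reported odds ratio (1.82), 95% confidence interval (1.39-2.39), and p-value can be cross-checked against one another. The sketch below assumes the interval is a Wald-type interval that is symmetric on the log-odds scale, which is common for logistic models but is not stated in the abstract; the function name and the 1.959964 critical value are illustrative choices, not details from the paper.

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover log-scale quantities from an odds ratio and its 95% CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE.
    Returns (geometric CI midpoint, SE of log OR, z statistic, two-sided p).
    """
    log_se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    # For a Wald interval, the geometric mean of the limits equals the OR.
    center = math.sqrt(ci_low * ci_high)
    z = math.log(odds_ratio) / log_se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return center, log_se, z, p_two_sided


center, log_se, z, p = wald_summary(1.82, 1.39, 2.39)
print(f"midpoint={center:.3f}  SE(log OR)={log_se:.3f}  z={z:.2f}  p={p:.1e}")
```

Under that assumption, the geometric midpoint of the interval lands very close to the reported odds ratio, and the implied p-value is of the same order of magnitude as the reported 2.0 × 10⁻⁵. Small differences are expected from rounding of the published figures.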


Subjects
Genetic Predisposition to Disease , Nerve Tissue Proteins/genetics , Receptors, Nicotinic/genetics , Smoking/genetics , Tobacco Use Disorder/genetics , Adolescent , Adult , Age Factors , Cohort Studies , Female , Haplotypes , Humans , Logistic Models , Male , Middle Aged , Polymorphism, Single Nucleotide , Protein Subunits/genetics , Risk Factors , Tobacco Use Disorder/ethnology , White People/genetics
10.
Genome Res ; 14(10A): 1821-31, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15364900

ABSTRACT

To promote the clinical and epidemiological studies that improve our understanding of human genetic susceptibility to environmental exposure, the Environmental Genome Project (EGP) has scanned 213 environmental response genes involved in DNA repair, cell cycle regulation, apoptosis, and metabolism for single nucleotide polymorphisms (SNPs). Many of these genes have been implicated by loss-of-function mutations associated with severe diseases attributable to decreased protection of genomic integrity. Therefore, the hypothesis for these studies is that individuals with functionally significant polymorphisms within these genes may be particularly susceptible to genotoxic environmental agents. On average, 20.4 kb of baseline genomic sequence or 86% of each gene, including a substantial amount of introns, all exons, and 1.3 kb upstream and downstream, were scanned for variations in the 90 samples of the Polymorphism Discovery Resource panel. The average nucleotide diversity across the 4.2 Mb of these 213 genes is 6.7 × 10⁻⁴, or one SNP every 1500 bp, when two random chromosomes are compared. The average candidate environmental response gene contains 26 PHASE-inferred haplotypes, 34 common SNPs, 6.2 coding SNPs (cSNPs), and 2.5 nonsynonymous cSNPs. SIFT and PolyPhen analysis of 541 nonsynonymous cSNPs identified 57 potentially deleterious SNPs. An additional eight polymorphisms predict altered protein translation. Because these genes represent 1% of all known human genes, extrapolation from these data predicts the total genomic set of cSNPs, nonsynonymous cSNPs, and potentially deleterious nonsynonymous cSNPs. The implications for the use of these data in direct and indirect association studies of environmentally induced diseases are discussed.
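Two figures in this abstract follow from simple arithmetic: the spacing between SNPs is the reciprocal of the nucleotide diversity, and the genome-wide extrapolation divides an observed count by the fraction of genes sampled. The sketch below works through both using only numbers from the abstract. The function names are illustrative, and the extrapolation assumes the 213 genes are representative of all human genes, which may not hold for genes selected for their role in environmental response.

```python
def bases_per_difference(nucleotide_diversity):
    """Average bases between differences when two random chromosomes are compared."""
    return 1.0 / nucleotide_diversity


def extrapolate_genome_wide(count_in_sample, fraction_of_genes):
    """Scale an observed count up to the whole gene set (assumes representativeness)."""
    return count_in_sample / fraction_of_genes


# Nucleotide diversity of 6.7e-4 gives roughly one difference per 1,490 bp,
# which the abstract rounds to 1500 bp.
spacing = bases_per_difference(6.7e-4)

# 57 potentially deleterious SNPs in about 1% of known genes.
deleterious_estimate = extrapolate_genome_wide(57, 0.01)

# Average of 6.2 cSNPs per gene across 213 genes, scaled the same way.
csnp_estimate = extrapolate_genome_wide(6.2 * 213, 0.01)

print(round(spacing), round(deleterious_estimate), round(csnp_estimate))
```

These are order-of-magnitude estimates only; the abstract does not state the totals the authors derived, so the scaled values here should not be read as the paper's results.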


Subjects
Environmental Exposure , Genetic Variation , Apoptosis/genetics , Cell Cycle/genetics , DNA Repair/genetics , Exons , Humans , Regulatory Sequences, Nucleic Acid
11.
Am J Hum Genet ; 72(4): 931-9, 2003 Apr.
Article in English | MEDLINE | ID: mdl-12632325

ABSTRACT

Mutations in the dystrophin gene result in both Duchenne and Becker muscular dystrophy (DMD and BMD), as well as X-linked dilated cardiomyopathy. Mutational analysis is complicated by the large size of the gene, which consists of 79 exons and 8 promoters spread over 2.2 million base pairs of genomic DNA. Deletions of one or more exons account for 55%-65% of cases of DMD and BMD, and a multiplex polymerase chain reaction method (currently the most widely available method of mutational analysis) detects approximately 98% of deletions. Detection of point mutations and small subexonic rearrangements has remained challenging. We report the development of a method that allows direct sequence analysis of the dystrophin gene in a rapid, accurate, and economical fashion. This same method, termed "SCAIP" (single condition amplification/internal primer) sequencing, is applicable to other genes and should allow the development of widely available assays for any number of large, multiexon genes.


Subjects
Dystrophin/genetics , Muscular Dystrophy, Duchenne/genetics , Base Sequence , Cardiomyopathy, Dilated/genetics , Genetic Counseling , Humans , Molecular Sequence Data , Polymerase Chain Reaction/methods , Sequence Analysis, DNA/methods
12.
J Hum Genet ; 47(12): 665-76, 2002.
Article in English | MEDLINE | ID: mdl-12522688

ABSTRACT

The ubiquitin ligase NEDD4L is a candidate gene for essential hypertension on both functional and genetic grounds. By targeting the epithelial sodium channel (ENaC) for degradation, NEDD4L is a significant determinant of sodium reabsorption in the distal nephron. Genetic linkage has been reported to a region of chromosome 18q harboring the gene, with phenotypes that include a rare orthostatic hypotension disorder, essential hypertension, and postural change in systolic blood pressure. A systematic search for genetic polymorphisms by resequencing exons and intron boundaries in 48 Caucasians yielded 38 variants. Among these, variant 13 is common, with either G (70%) or A (30%) as the last nucleotide of a putative exon 1. This mutation could affect the generation of a previously unrecognized splice isoform. In subsequent experiments, (1) we confirmed the presence of this putative isoform in both kidney and adrenals; (2) we established that variant 13-A leads to the systematic use of an alternative splice site, generating a transcript encoding a nonfunctional protein; and (3) we demonstrated differences in tissue-specific expression of the novel isoform relative to its previously reported counterpart. Variant 13-A precludes the formation of a transcript encoding a full-length Ca2+-dependent lipid-binding (C2) domain with very high evolutionary conservation among NEDD4L orthologs. A similar C2 domain in the paralogous NEDD4 gene plays a significant role in the transfer of its product to the apical membrane of epithelial cells. Differential function of NEDD4L isoforms could prove significant in blood pressure regulation through an effect on ENaC-dependent sodium reabsorption.


Subjects
Alternative Splicing , Calcium-Binding Proteins/genetics , Chromosomes, Human, Pair 18/genetics , Frameshift Mutation , Hypertension/genetics , Ligases/genetics , Polymorphism, Genetic , Ubiquitin-Protein Ligases , Amino Acid Sequence , Endosomal Sorting Complexes Required for Transport , Exons , Humans , Hypertension/physiopathology , Introns/genetics , Kidney/metabolism , Molecular Sequence Data , Nedd4 Ubiquitin Protein Ligases , Reverse Transcriptase Polymerase Chain Reaction , Sequence Analysis, DNA , Sequence Homology, Amino Acid , Tissue Distribution